Landslide Susceptibility Mapping Using Active Remote Sensing Data with Interpretable Machine Learning

Suh, Yong-hun

Table of Contents

1. Self-Introduction

2. What I have done

  • Research Background and Objectives
  • Literature Review
  • Data and Methodology
  • Results

3. Future Plans

Self-Introduction

About Me

Name: Yong-hun Suh

Current Role: Ph.D. Intern at the GIS Team

Education

  • M.A. in Geography, Seoul National University
  • B.A. in Geography & B.Sc. in Atmospheric Science, Kongju National University

Currently Involved Project

  • UB Clean Air (as a research associate)
    • State University of New York at Buffalo (UB)
    • Funded by the U.S. Environmental Protection Agency (EPA)

📸 From Göreme, Turkey

Research Interests

Fields

  • Geographic Information System/Science (that involves)
    • Remote Sensing
    • Machine Learning
    • High-Performance Computing (hopefully in the near future)


Previous Experience

  • Research on landslide susceptibility, population density estimation, and air pollution impacts on humanity
  • For more details, please visit my website here!

What I have done

  • Landslide Susceptibility Mapping Using Active Remote Sensing Data with Interpretable Machine Learning

Good Prerequisites to Know

What is Active Remote Sensing (ARS)

  • Uses active sensors that emit their own energy
  • Measures the reflected or backscattered signal from targets
  • Operates independently of sunlight
  • Can penetrate cloud cover and some vegetation

Some Examples of ARS

  • RAdio Detection And Ranging (RADAR)
  • LIght Detection And Ranging (LIDAR)

Source: NASA GPM

Good Prerequisites to Know

Active vs Passive Remote Sensing (in terms of ground monitoring sensors)

📡️ Active Sensing

  • Uses slant range geometry
  • Emits its own signal
    • ✅ Operates in any weather or light conditions
    • ❌ Requires complex post-processing to separate useful information from noise
  • Measures backscatter or phase difference
    • ✅ Suitable for structural analysis (e.g., terrain, infrastructure)
    • ❌ Less effective for material composition or color

🌞 Passive Sensing

  • Uses nadir-looking (i.e., vertical) geometry to measure reflectance
    • ✅ Offers high spectral resolution (e.g. vegetation indices)
    • ❌ Limited to sunlit conditions, vulnerable to clouds
  • Relies on solar illumination
    • ✅ Simpler calibration than active sensing
    • ❌ No data acquisition at night or under dense cloud cover
  • Measures reflected solar energy
    • ✅ Great for land cover classification
    • ❌ No range information provided

Research Background

Increasing landslide risk

  • Due to climate change and frequent extreme rainfall events (World Bank, 2020)


Potential of Active Remote Sensing (ARS)

  • Creep precedes slope failure (Intrieri et al., 2018; Kilburn & Petley, 2003)
    → Slope surface displacement can be measured with Interferometric SAR (InSAR)
  • Rainfall intensity is an important trigger (Tang et al., 2023; Chen et al., 2015)
    → Actual (de facto) precipitation intensity can be estimated with weather radar

Trends in Landslide Recovery Costs and Damage Volume in Korea (Korea Forest Service, 2023)

Taking advantage of Active Remote Sensing

  • Landslides are closely related to heavy precipitation
    → Thick clouds are usually present when they occur!
  • ARS, which can see through clouds, can therefore be useful for predicting landslide susceptibility

Study Area

Uljin County, Gyeongsangbuk-do

  • The Gyeongsang-do region has experienced the highest level of landslide damage over the past decade (Korea Forest Service, 2022).

  • As of June 2023, Gyeongsangbuk-do recorded (Park, 2023):

    • 4,935 landslide-prone areas (17.7% of the national total)

    • 9,977 residents living in vulnerable zones (13.8%)

  • Based on these figures, this study selected Uljin County in Gyeongsangbuk-do as the case study area.

Study Area

Data Used (Cont’d)

Landslide Modeling Variables

  • Non-landslide samples were selected by stratified sampling so that their number matches the number of landslide occurrences.

  • Displacement and rainfall data were collected for each event date and assigned to the corresponding samples.

  • All variables were resampled to 10 m rasters to produce the susceptibility map (i.e., the predictions).

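Table: List of Variables Used in the Landslide Susceptibility Analysis

| Variable Type | Variable Name                   | Spatial Resolution | Data Type   | Source   |
|---------------|---------------------------------|--------------------|-------------|----------|
| Feature       | Elevation (DEM)                 | 30 m               | Continuous  | USGS     |
| Feature       | Slope                           | 30 m               | Continuous  | -        |
| Feature       | Aspect                          | 30 m               | Categorical | -        |
| Feature       | Topographic Wetness Index (TWI) | 30 m               | Continuous  | -        |
| Feature       | Ground Displacement             | 80 m               | Continuous  | ASF DAAC |
| Feature       | NDVI                            | 10 m               | Continuous  | ESA      |
| Feature       | Max. Daily Rainfall Intensity   | 500 m              | Continuous  | KMA      |
| Feature       | 48h Antecedent Rainfall         | 500 m              | Continuous  | KMA      |
| Feature       | Soil Type (Generalized)         | 30 m               | Categorical | RDA      |
| Feature       | Distance from Faults            | 30 m               | Continuous  | KIGAM    |
| Feature       | Distance from Rivers            | 30 m               | Continuous  | MOLIT    |
| Feature       | Distance from Roads             | 30 m               | Continuous  | MOLIT    |
| Target        | Landslide Inventory             | -                  | Binary      | MOIS     |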
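
The slides do not say how the Topographic Wetness Index was computed. Assuming the commonly used definition, it can be written as below, where a is the specific catchment area (upslope contributing area per unit contour width) and β is the local slope angle. Both symbols are introduced here for illustration only.

```latex
\mathrm{TWI} = \ln\!\left(\frac{a}{\tan\beta}\right)
```

Under this definition, flat cells with large contributing areas receive high TWI values, which is why the index is often read as a proxy for where water tends to accumulate.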

Data Used (Cont’d)

Landslide Inventory

  • A total of 9,656 landslide events were recorded nationwide.

    • Spans June 29, 2011, to June 23, 2022
    • Includes coordinates, date/time of occurrence, and cause
    • 363 cases in the study area (Uljin County), across two periods
  • Obtained from the publicly available Korea Safety Map by MOIS.

    • Acquired by web crawling the Korea Safety Map web app with the R package RSelenium


Korea Safety Map

The Landslide Inventory

Data Used (Cont’d)

Ground Displacement Data

  • Captures ground motion along the satellite sensor’s line of sight (LoS) between two passes over the same area.

    • Derived with Interferometric Synthetic Aperture Radar (InSAR), which uses the phase difference between two radar images.
  • Variable Details

    • Processed from the two Single Look Complex (SLC) images acquired prior to each event date.
      → Assigned as the observation value for each event date.

    • Absolute values were taken to simplify interpretation.

    • Radar-space pixel spacing: 2.3 m × 14.1 m, resampled to 80 m × 80 m in geographic space.

    • Unit: meters over the temporal baseline (in days).

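| Pair | Direction | Data Type | Acquisition Dates        | Perpendicular Baseline | Temporal Baseline |
|------|-----------|-----------|--------------------------|------------------------|-------------------|
| 1    | Ascending | SLC       | 07 Sep 2019, 25 Sep 2019 | 41.123 m               | 18 days           |
| 2    | Ascending | SLC       | 26 Aug 2020, 07 Sep 2020 | 84.548 m               | 12 days           |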
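
As a rough illustration of how a phase difference becomes a LoS displacement, here is a minimal sketch. It assumes a C-band wavelength of about 5.55 cm, and the sign convention is also an assumption because it differs between processors. The function names are hypothetical, and this is not the processing chain used in the study.

```python
import math

# Assumption: approximate C-band radar wavelength in metres.
# The slides do not state the exact sensor wavelength.
WAVELENGTH_M = 0.0555


def phase_to_los_displacement(unwrapped_phase_rad, wavelength_m=WAVELENGTH_M):
    """Convert unwrapped interferometric phase (radians) to LoS displacement (metres).

    A full 2*pi phase cycle corresponds to half a wavelength of LoS motion,
    because the radar signal travels the sensor-target path twice.
    Sign convention (assumed): positive phase maps to negative displacement.
    """
    return -wavelength_m / (4.0 * math.pi) * unwrapped_phase_rad


def abs_los_displacement(unwrapped_phase_rad, wavelength_m=WAVELENGTH_M):
    """Absolute LoS displacement, mirroring the 'absolute values' step above."""
    return abs(phase_to_los_displacement(unwrapped_phase_rad, wavelength_m))


# One full fringe equals half a wavelength of motion along the line of sight.
one_fringe_m = abs_los_displacement(2.0 * math.pi)
```

Taking the absolute value removes the dependence on the sign convention, which is consistent with the simplification described in the variable details.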

The LoS deformation data

Data Used

HSR (Hybrid Surface Rainfall) Data

  • 5-minute temporal resolution and 500 m × 500 m spatial resolution

  • Offers a more realistic estimate of rainfall compared to interpolated rain gauge data

  • Variable Details

    • For each case date, the maximum rainfall intensity and 48-hour antecedent rainfall were extracted as observed values

    • Reflectivity was converted to rainfall using the Z-R relationship
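
The two rainfall variables can be sketched as follows. The slides do not give the Z-R coefficients, so this example assumes the classic Marshall-Palmer form Z = 200 R^1.6. The function names and the sample series are hypothetical.

```python
import math

# Assumption: Marshall-Palmer coefficients for Z = A * R**B.
# The coefficients used for the HSR product are not given in the slides.
A, B = 200.0, 1.6
STEP_MIN = 5  # HSR time step in minutes, as stated above


def dbz_to_rain_rate(dbz, a=A, b=B):
    """Convert reflectivity in dBZ to rain rate in mm/h via Z = a * R**b."""
    z_linear = 10.0 ** (dbz / 10.0)  # dBZ -> linear reflectivity factor
    return (z_linear / a) ** (1.0 / b)


def rainfall_features(dbz_series, step_min=STEP_MIN):
    """Return (max intensity in mm/h, accumulated rainfall in mm) for a dBZ series.

    For a 48-hour antecedent window at 5-minute steps, the series would
    hold 576 values for one pixel.
    """
    rates = [dbz_to_rain_rate(v) for v in dbz_series]
    max_intensity = max(rates)
    accumulation = sum(r * step_min / 60.0 for r in rates)
    return max_intensity, accumulation


# Hypothetical one-hour series at the reflectivity that maps to 1 mm/h.
dbz_1mmh = 10.0 * math.log10(A)
peak, total = rainfall_features([dbz_1mmh] * 12)
```

With these assumed coefficients, twelve 5-minute steps at 1 mm/h accumulate to about 1 mm, which is a quick sanity check on the unit handling.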


Maximum Rainfall Intensity

48-Hour Antecedent Rainfall

Methodology Overview

  • Sampling: Stratified sampling
  • Models: XGBoost, Random Forest, Support Vector Machines (RBF, polynomial, and linear kernels), Logistic Regression (elastic net regularization)
  • Training & Validation: Spatial resampling and cross-validation (k = 10, 2 repetitions)
  • Interpretation: SHAP analysis for feature importance & partial dependence plots

Flowchart of the modeling

Methodology Overview (Cont’d)

Landslide Susceptibility Modeling

  • Multiple machine learning models were employed to identify the optimal model.
  • Hyperparameter tuning was conducted to improve model performance and to guard against both underfitting and overfitting.
    • A random search strategy was used with n = 200 iterations.
    • Spatial resampling was applied to keep training and validation data spatially separated.
      → This reduces the bias caused by spatial autocorrelation and helps prevent overfitting (Lovelace et al., 2019).
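
To make the idea of spatial separation concrete, here is a minimal sketch of block-based spatial cross-validation. It is a simplified stand-in rather than the resampling scheme used in the study: points are grouped into square grid blocks, and whole blocks are assigned to folds. The function name and the `block_size` parameter are hypothetical.

```python
import random
from collections import defaultdict


def spatial_block_folds(coords, block_size, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs where each grid block falls in one fold only.

    coords: list of (x, y) tuples in projected units (e.g., metres).
    block_size: side length of the square blocks (a hypothetical tuning choice).
    """
    # Group point indices by the grid block that contains them.
    blocks = defaultdict(list)
    for i, (x, y) in enumerate(coords):
        blocks[(int(x // block_size), int(y // block_size))].append(i)

    # Shuffle the blocks reproducibly, then deal them out to the k folds.
    block_ids = sorted(blocks)
    random.Random(seed).shuffle(block_ids)
    folds = [[] for _ in range(k)]
    for n, block_id in enumerate(block_ids):
        folds[n % k].extend(blocks[block_id])

    for test_idx in folds:
        if not test_idx:
            continue  # fewer blocks than folds
        test_set = set(test_idx)
        train_idx = [i for i in range(len(coords)) if i not in test_set]
        yield train_idx, sorted(test_idx)
```

Because neighbouring points share a block, a model is never validated on points that sit right next to its training points, which is the property that ordinary random k-fold splitting lacks.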


Spatial Resampling

Methodology Overview

Model Performance Evaluation

  • Instances were split 7:3 (training:testing) for model evaluation
  • Models tested: Logistic Regression, SVMs, Random Forest, and XGBoost
  • Metrics used: Accuracy, Specificity, Sensitivity, F1-score, AUC-ROC
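
For reference, the listed metrics can be computed from binary labels and predicted scores as in the sketch below. The function names and the toy labels are hypothetical and are not the study's data.

```python
def _ratio(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and F1 from binary labels (1 = landslide)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = _ratio(tp, tp + fn)  # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)  # true negative rate
    precision = _ratio(tp, tp + fp)
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": _ratio(2.0 * precision * sensitivity, precision + sensitivity),
    }


def auc_roc(y_true, scores):
    """AUC-ROC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


# Toy example with made-up labels and scores.
toy_true = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.3, 0.1, 0.2, 0.7]
toy_pred = [1 if s >= 0.5 else 0 for s in toy_scores]
toy_metrics = classification_metrics(toy_true, toy_pred)
toy_auc = auc_roc(toy_true, toy_scores)
```

Sensitivity and specificity are reported separately because missing a landslide and raising a false alarm carry different costs in hazard mapping.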

Interpretable Machine Learning with SHAP

  • SHAP was used to interpret model predictions
  • TreeSHAP was used, as it is somewhat more robust to correlated features (Molnar, 2020)
  • SHAP-based feature importance was used to remove less relevant features, improving model performance (Inan & Rahman, 2023)
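
To show what a SHAP value represents, the sketch below computes exact Shapley values by brute force for a tiny toy model. This is not TreeSHAP, which obtains such attributions efficiently for tree ensembles. The brute-force version is exponential in the number of features and is only meant to illustrate the definition. The toy model, the single-baseline treatment of missing features, and all names are assumptions.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a single baseline.

    Features outside a coalition are replaced by their baseline values
    (a simplifying assumption made for this illustration).
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    """Hypothetical model: one additive term plus one interaction term."""
    return 2.0 * z[0] + z[1] * z[2]


phi = shapley_values(toy_model, x=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
```

In this toy case the additive feature receives its full effect, the two interacting features split their joint effect equally, and the attributions sum to the difference between the prediction and the baseline prediction.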

Results (Cont’d)

Variable Reduction Using SHAP

  • Initial model fitting using all features
    • Among the six models, XGB and RF showed the best performance (①, ④)
  • Re-modeling using the top 10 most important features (③)
    • The top 10 features were identical for both XGB and RF models
  • Among the six models, XGBoost showed the best overall performance and was selected as the final model (②, ④)
  • XGBoost outperformed Random Forest in all metrics except specificity (②)

①. Performance metrics for initial XGB and RF

| Metric      | XGB   | RF    |
|-------------|-------|-------|
| Accuracy    | 0.812 | 0.803 |
| Sensitivity | 0.835 | 0.789 |
| Specificity | 0.789 | 0.817 |
| F1-score    | 0.816 | 0.800 |
| AUC-ROC     | 0.902 | 0.892 |

②. Performance metrics for reduced XGB and RF

| Metric      | XGB   | RF    |
|-------------|-------|-------|
| Accuracy    | 0.821 | 0.794 |
| Sensitivity | 0.835 | 0.752 |
| Specificity | 0.807 | 0.835 |
| F1-score    | 0.824 | 0.785 |
| AUC-ROC     | 0.906 | 0.893 |
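
The feature-reduction step can be sketched as ranking features by their mean absolute SHAP value and keeping the top k. This is a minimal illustration that assumes the importance measure is mean |SHAP|. The function name, the feature names, and the numbers are made up.

```python
def top_k_features(shap_matrix, feature_names, k=10):
    """Rank features by mean absolute SHAP value and return the k highest.

    shap_matrix: list of rows, one per sample, one SHAP value per feature.
    """
    n_rows = len(shap_matrix)
    importance = {
        name: sum(abs(row[j]) for row in shap_matrix) / n_rows
        for j, name in enumerate(feature_names)
    }
    ranked = sorted(importance, key=importance.get, reverse=True)
    return ranked[:k]


# Hypothetical SHAP values for three samples and four features.
toy_names = ["f_a", "f_b", "f_c", "f_d"]
toy_shap = [
    [0.10, -0.40, 0.02, 0.20],
    [-0.20, 0.50, 0.01, -0.10],
    [0.15, -0.30, -0.03, 0.30],
]
kept = top_k_features(toy_shap, toy_names, k=2)
```

The absolute value matters here: a feature that pushes predictions strongly in both directions would otherwise average out to near zero and be dropped.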

③. Feature importance of initial RF

④. ROC curves before and after feature reduction

Results (Cont’d)

Interpretation of Active Remote Sensing Data Using SHAP

  1. Daily maximum rainfall intensity
    • Third-highest contributor; stronger intensity generally increases landslide risk.
  2. 48h antecedent rainfall
    • Also highly influential; greater accumulation tends to raise risk, though effects are less consistent.
  3. The absolute value of surface displacement
    • Relatively lower SHAP importance, but larger displacement magnitudes are associated with higher landslide probability

Results (Cont’d)

Susceptibility Mapping with XGBoost

  • The final model predicts landslide susceptibility at 10 m resolution
  • It was trained with date-specific inputs, so results vary by event
  • Susceptibility updates as new data arrive
    → Supports real-time monitoring by feeding in new observations

Results

Comparison with the Korea Forest Service (KFS) Map

  • Susceptibility adapts to input data, reflecting event-specific conditions
  • More context-sensitive than static susceptibility maps by the KFS


Summary of Results

  • XGBoost showed the best predictive performance
  • Final model retrained using top 10 variables
  • Rainfall-related variables were among the most influential
  • Ground displacement had moderate influence

Conclusion

  • Active remote sensing data are effective in landslide susceptibility modeling
  • The proposed approach enables real-time monitoring and is more realistic than the static models currently used in practice

Limitations & Future Work

  • Radar improvement: Use L-band instead of C-band SAR to reduce vegetation interference
  • Topographic unit analysis: Use slope units instead of pixel-based modeling to better capture topographic characteristics
  • Model enhancement: Adopt advanced feature selection methods (e.g., Recursive Feature Elimination)

References

  • Chen, C. W., Saito, H., & Oguchi, T. (2015). Rainfall intensity–duration conditions for mass movements in Taiwan. Progress in Earth and Planetary Science, 2, 1-13.
  • Inan, M. S. K., & Rahman, I. (2023). Explainable AI Integrated Feature Selection for Landslide Susceptibility Mapping Using TreeSHAP. SN Computer Science, 4(5), 482.
  • Intrieri, E., Raspini, F., Fumagalli, A., Lu, P., Del Conte, S., Farina, P., … & Casagli, N. (2018). The Maoxian landslide as seen from space: detecting precursors of failure with Sentinel-1 data. Landslides, 15, 123-133.
  • Kilburn, C. R., & Petley, D. N. (2003). Forecasting giant, catastrophic slope collapse: lessons from Vajont, Northern Italy. Geomorphology, 54(1-2), 21-32.
  • Korea Forest Service. (2023). 2020 Domestic Landslide Statistics by Year of Damage Volume [Dataset]. Landslide Information System. https://sansatai.forest.go.kr/gis/main.do#mhms1
  • Lovelace, R., Nowosad, J., & Muenchow, J. (2019). Geocomputation with R. Chapman and Hall/CRC.
  • Molnar, C. (2020). Interpretable machine learning. https://christophm.github.io/interpretable-ml-book/
  • Park, M. G. (June 26, 2023). As of June, there are 28,000 landslide-prone areas nationwide, increasing every year! Gyeongbuk Damin Ilbo. http://www.hidomin.com/news/articleView.html?idxno=519220
  • Tang, Q., Gratchev, I., & Ravindran, S. (2023). Effect of rainfall intensity on landslide initiation: flume tests and numerical analysis. Geotechnics, 3(1), 104-115.

3. Future Plans

Future Research Plans at the Lab

Refugee Camp Detection

  • Apply remote sensing and machine learning techniques to detect and monitor refugee settlements from satellite imagery, contributing to humanitarian response and policy planning

Assessing the applicability of HPC in GIS

  • Utilize GPU-based computing environments to accelerate parallelizable, computationally intensive large-scale geospatial data processing

Conclusion

  • Looking Ahead:
    • Aim to contribute to the lab’s ongoing projects and expand the application of GIS
  • Acknowledgments:
    • Grateful for the opportunity to join the team and eager to collaborate on future research

Questions and Discussion

Contact Information

Thank you!